Project Objectives

•  Develop technologies to identify and remediate
attacking networks (e.g., botnets).

•  Disrupt the botnet command and control (C&C).
Without C&C, a botnet is an unorganized
infection.

•  Detection techniques must be evasion-resistant
and not dependent on any single protocol.


Scientific/Technical Approaches

•  DNS-Based Detection: Using DDNS and high-speed DNS
monitoring, we will detect botnet activity, regardless of the
underlying C&C protocol.

•  Flow/Traffic-Based Detection: We will use flow-based
anomaly detection techniques for evasive botnets that do not
use DNS at all.

•  Response: We will use proxynets, blackholes, sinkholes,
and other technologies to disrupt the botnet C&C and
enable traditional response techniques.


Accomplishments

•  Developed and deployed a set of DNS-based monitoring
and surveying systems for Internet-scale botnet detection
and situation awareness.

•  Developed a family of botnet detection systems for
enterprise networks.

•  Ongoing and successful technology transfer: Damballa.

•  New project from DHS: prototype and deployment.
Next-Generation Botnet Detection and Response


•  Highlights 

-  Dynamic  DNS  monitoring  heuristics  to  identify  domains  used  for 
botnet  command  and  control 

-  Surveying  method  for  (misconfigured/malicious)  Open  Recursive 
DNS  servers  on  the  Internet 

-  Anomaly  detection  algorithms  for  Recursive  DNS  servers  at  ISPs 
and  enterprise  networks 

-  Botnet detection systems for enterprise networks:
BotHunter, BotSniffer, BotMiner, and BotProbe

-  Related efforts: CyberTA (SRI), new DHS project

-  Formed  a  start-up  company  Damballa,  Inc.  to  deliver  anti-botnet 
technologies  to  government  and  enterprise  customers. 
We highlight the BotSniffer system in this
report and provide a list of publications at its
end. These papers describe the technologies
developed in this project in detail.


BotSniffer: Detecting Botnet C&C in Enterprise Networks
[Figure (a): Two styles of botnet C&C. Panel (II), pull style: bots ask the HTTP C&C server "What's current command?" and receive the command. Labels: Bot Master, C&C Server, command.]

[Figure (b): An IRC-based C&C communication example between the bot master and a bot via an IRC C&C server. The exchange, in order: .login (with password); Password accepted; .bot.about; Phatbot3 (Alpha 1) "Release" on "Win32"; .bot.sysinfo; cpu: ... ram: ... os: ...; .scan.start; CSendFile(0x46E46A28h): Transfer to X.X.X.X finished]
Botnet C&C Detection


•  C&C  is  essential  to  a  botnet 

-  Without  C&C,  bots  are  just  discrete,  unorganized  infections 

•  C&C detection is important

-  C&C is relatively stable and unlikely to change within a botnet

-  Detection reveals both the C&C server and the local victims

-  C&C is the weakest link if the C&C server can be detected and taken down

•  C&C detection is hard

-  Bots use existing common protocols instead of new ones

-  Low traffic rate

-  Obscure/obfuscated communication
Botnet C&C: Spatial-Temporal Correlation and Similarity


[Figure: (a) Message response crowd; (b) Activity response crowd]
BotSniffer Architecture
Correlation Engine


•  Group  clients  according  to  their  destination  IP  and  Port 
pair  (HTTP/IRC  connection  record) 

•  Perform  a  group  analysis  on  spatial-temporal  correlation 
and  similarity  property 

•  Currently 

-  Response-Crowd-Density-Check  algorithm  for  group  activity 
response  analysis 

-  Response-Crowd-Homogeneity-Check  algorithm  for  group 
message  response  analysis. 
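
The grouping step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout `(client_ip, dst_ip, dst_port)` and the function name `group_clients_by_destination` are hypothetical and are not taken from BotSniffer itself.

```python
from collections import defaultdict


def group_clients_by_destination(records):
    """Group client IPs by the (destination IP, destination port) pair they contact.

    `records` is an iterable of (client_ip, dst_ip, dst_port) tuples; this
    layout is an assumption made for the sketch.
    """
    groups = defaultdict(set)
    for client_ip, dst_ip, dst_port in records:
        groups[(dst_ip, dst_port)].add(client_ip)
    return dict(groups)


# Hypothetical HTTP/IRC connection records (documentation address ranges).
records = [
    ("10.0.0.5", "198.51.100.7", 6667),
    ("10.0.0.9", "198.51.100.7", 6667),
    ("10.0.0.5", "203.0.113.20", 80),
]
groups = group_clients_by_destination(records)
```

Each resulting group is the unit on which the density and homogeneity checks would then operate.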
Response-Crowd-Density-Check Algorithm


•  Response crowd

-  a set of clients having (message/activity) response behavior

•  Dense crowd

-  a response crowd in which the fraction of clients in the group
that show message/activity response behavior is larger than a
threshold (e.g., 0.5)

•  Example: 5 clients connected to the same IRC/HTTP server, and all of
them scan at a similar time (or send messages at a similar time)


•  Sequential  Probability  Ratio  Testing 
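
The dense-crowd definition above amounts to a single ratio test. The following is a minimal sketch, assuming clients are identified by hashable IDs such as IP strings; the function name `is_dense_crowd` is hypothetical, and the default threshold of 0.5 is the example value from the slide.

```python
def is_dense_crowd(group_clients, responding_clients, threshold=0.5):
    """Return True if the fraction of group members that responded exceeds `threshold`.

    `group_clients` are all clients connected to the same server;
    `responding_clients` are those that showed a message/activity response
    in the current time window.
    """
    group = set(group_clients)
    if not group:
        return False  # an empty group cannot form a crowd
    responders = group & set(responding_clients)
    return len(responders) / len(group) > threshold


# The slide's example: 5 clients on one server, all scanning at a similar time.
clients = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"]
all_scan = is_dense_crowd(clients, clients)
two_scan = is_dense_crowd(clients, clients[:2])
```

Here `all_scan` is true (5 of 5 respond), while `two_scan` is false (2 of 5 is below the 0.5 threshold).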
Sequential Probability Ratio Testing (SPRT)


•  Each round (a time window), observe whether the current crowd is dense or
not (Y)

-  Hypotheses

-  Pr(Y=1 | H1) very high (for botnet)

-  Pr(Y=1 | H0) very low (for normal user)

•  Make a random walk according to the observation Y

•  After several rounds, we may reach a decision (which hypothesis is more
likely, H1 or H0)


•  Also  called  TRW  (Threshold  Random  Walk) 

•  Bounded  false  positive  and  false  negative  rate  (as  desired),  and  usually 
needs  only  a  few  rounds 


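\[
\Lambda_n = \ln \frac{\Pr(Y_1, \ldots, Y_n \mid H_1)}{\Pr(Y_1, \ldots, Y_n \mid H_0)}
          = \ln \frac{\prod_i \Pr(Y_i \mid H_1)}{\prod_i \Pr(Y_i \mid H_0)}
          = \sum_i \ln \frac{\Pr(Y_i \mid H_1)}{\Pr(Y_i \mid H_0)}
\]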
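
The random walk can be illustrated with a short sketch of a standard SPRT over binary observations. The parameter values (`theta1`, `theta0`, `alpha`, `beta`) and the function name `sprt_decide` are assumptions chosen for illustration, not values from the project; the stopping bounds are Wald's usual approximations.

```python
import math


def sprt_decide(observations, theta1=0.9, theta0=0.1, alpha=0.01, beta=0.01):
    """Run a threshold random walk over binary observations Y (1 = dense crowd).

    theta1 = Pr(Y=1 | H1) and theta0 = Pr(Y=1 | H0); alpha and beta are the
    desired false positive and false negative rates. All defaults are
    illustrative assumptions.
    Returns (decision, rounds_used), with decision in {"H1", "H0", "undecided"}.
    """
    upper = math.log((1 - beta) / alpha)  # accept H1 (botnet) at or above this
    lower = math.log(beta / (1 - alpha))  # accept H0 (normal) at or below this
    llr = 0.0  # running log-likelihood ratio
    for rounds, y in enumerate(observations, start=1):
        if y:
            llr += math.log(theta1 / theta0)
        else:
            llr += math.log((1 - theta1) / (1 - theta0))
        if llr >= upper:
            return "H1", rounds
        if llr <= lower:
            return "H0", rounds
    return "undecided", len(observations)


botnet_like = sprt_decide([1, 1, 1])
normal_like = sprt_decide([0, 0, 0])
mixed = sprt_decide([1, 0, 1, 0])
```

With these assumed parameters, three consecutive dense rounds are enough to accept H1 and three consecutive non-dense rounds are enough to accept H0, while alternating observations leave the walk between the two bounds.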
Response-Crowd-Homogeneity-Check Algorithm


•  A homogeneous response crowd

-  most of the members have very similar responses

•  Similarity is defined as follows

-  Message response

-  Similar payload (Dice distance)

\[
\mathrm{Dice}(X, Y) = \frac{2\,\lvert \mathrm{ngrams}(X) \cap \mathrm{ngrams}(Y) \rvert}{\lvert \mathrm{ngrams}(X) \rvert + \lvert \mathrm{ngrams}(Y) \rvert}
\]


-  Activity response

-  Scan  same  ports  (subnet) 

-  Download  same  binary 

-  Send  similar  spam 
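
The Dice measure for message payloads can be sketched as follows. Several choices here are assumptions for illustration: character bigrams (`n=2`), treating n-grams as sets rather than multisets, and returning 0.0 when neither message is long enough to yield an n-gram. The function names are hypothetical.

```python
def ngrams(text, n=2):
    """Return the set of character n-grams of `text` (sets, not multisets: an assumption)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def dice(x, y, n=2):
    """Dice coefficient between the n-gram sets of two message payloads."""
    gx, gy = ngrams(x, n), ngrams(y, n)
    if not gx and not gy:
        return 0.0  # assumption: no n-grams on either side counts as no similarity
    return 2 * len(gx & gy) / (len(gx) + len(gy))


same = dice("scan.start", "scan.start")
partial = dice("abcd", "abxy")
disjoint = dice("ab", "cd")
```

Identical payloads score 1.0 and payloads with no shared n-grams score 0.0; a crowd would be considered homogeneous when most pairs of responses score above a chosen similarity threshold.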
Experiments

Trace   Trace size   Duration   Pkt          TCP flow              FP
IRC-1   54MB         171h       180,421      10,530      2,057     0
IRC-2   14MB         433h       33,320       4,061       335       0
IRC-3   516MB        1,626h     2,073,587    4,577       563       5
IRC-4   620MB        673h       4,071,707    24,837      228       2
IRC-5   3MB          30h        10,100       24          17        0
IRC-6   155MB        168h       1,033,318    6,081       85        1
IRC-7   60MB         420h       303,185      717         209       0
All-1   4.2GB        10m        4,706,803    14,475      1,625     0
All-2   6.2GB        10m        6,760,015    28,350      1,576     0
All-3   7.6GB        1h         16,523,826   331,706     1,717     0
All-4   15GB         1.4h       21,312,841   110,852     2,140     0
Discussion & Future Work


•  Evading HTTP autocorrelation by using a very long
period

•  Evasion  using  other  protocols  or  self-designed 
protocols 

•  Effect  of  encryption 

•  Evasion  by  using  random  delay/period,  injecting 
random  noise,  injecting  random  garbage  in  the  packet 

•  A  new  system  under  development  will  address  these 
problems 
Project Statistics and Summary


Students supported:

-  1 undergraduate student

-  3 graduate students

-  2 PhDs expected May/August 2008

Publications:

-  5 conference papers

-  1 book chapter

Technology Transitions:

-  4 patents (disclosures)

-  1 start-up: Damballa, Inc.

-  1 DHS Type II project